ABSTRACT

The general usefulness of selected prediction equations for
computer-simulated scoring of creativity tests was studied. This was
carried out by testing previously established prediction equations on
samples drawn from similar populations.
Scoring Creativity Tests by Computer Simulation:
A Validation of Prediction Equations



John F. Greene
University of Bridgeport 



Perry A. Zirkel 
University of Hartford 



In recent years several studies have attempted to use the
computer to simulate human behavior, especially human rating behavior
(e.g., Page and Paulus, 1968; Archambault, 1969; Greene, 1970; Whalen,
1970). These attempts included scoring essays and creativity tests
by computer. In each study, the researcher developed a multiple
regression prediction equation and then cross-validated it, either
empirically or statistically, within a randomly selected partition of
the original sample. None of the prediction equations, however, has
actually been applied and then evaluated in samples drawn from other
similar populations. Consequently, the objective of this study was to
determine the general usefulness of selected prediction equations.

Greene (1970) developed successful computer prediction models
for Activities four through seven of the Torrance Tests of Creative
Thinking, Verbal Form A (TTCT) (Torrance, 1966). Activities four
through seven are Product Improvement (toy elephant), Unusual Uses
(cardboard boxes), Unusual Questions (cardboard boxes), and Just
Suppose (if clouds had strings, what would happen?), respectively. Each
activity is scored for three dimensions of creativity: fluency,
flexibility, and originality. A flexibility score, however, is not
determined for the sixth activity, Unusual Questions. In Greene's (1970)
study four judges rated the responses of 153 subjects. Analysis of
variance procedures were used to provide an estimate of the reliability
of the pooled ratings of the judges, as suggested by Winer (1962,
pp. 124-132). A stepwise multiple regression technique was employed to
maximize the prediction of each subject's scores for each activity. The
predictors included the actuarial and dictionary parameters generated
earlier by computer. Besides the full model, restricted and forced
regression models were generated. The entire computerized scoring
procedure was then evaluated in a cross-validation sample.

*This research paper extends the principal author's doctoral
dissertation. The study was initiated while the author was a USOE
research fellow at The University of Connecticut. Appreciation is
expressed to Dieter Paulus and Joseph Renzulli for assistance.



METHOD 

In the present study, the TTCT was administered to 190 students in
two central Connecticut suburban schools. These students were in grades
4-7 and represented a sample similar to the New York sample used in the
original investigation. The responses of each subject were independently
rated by four judges. Adjusted reliability estimates of these pooled
ratings were determined by analysis of variance procedures. The
prediction equations developed by Greene (1970) were then used to score
the student responses. To determine the usefulness of the prediction
equations, the computer-generated scores for the full, restricted, and
forced models were correlated with the human ratings. These correlations
were corrected for attenuation due to the unreliability of the criterion
variable. Shrinkage was determined by contrasting the correlation
coefficients with those reported for the cross-validation sample in the
original study.
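
The validation steps described above can be illustrated with a minimal
Python sketch. It is not the original program: the regression weights,
predictor values, human ratings, and reliability figure below are all
hypothetical, and the correction shown divides the observed correlation
by the square root of the criterion reliability, which is the usual
single-variable correction for attenuation and is assumed here to be
the one intended.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def predict(intercept, weights, predictors):
    """Apply a previously fitted linear prediction equation to one subject."""
    return intercept + sum(w * p for w, p in zip(weights, predictors))


def correct_for_attenuation(r_xy, criterion_reliability):
    """Correct r for unreliability in the criterion only: r / sqrt(r_yy)."""
    return r_xy / math.sqrt(criterion_reliability)


def shrinkage(r_original, r_new):
    """Drop in correlation from the original cross-validation to the new sample."""
    return r_original - r_new


# Hypothetical equation and data (NOT taken from Greene, 1970).
intercept, weights = 1.0, [0.5, 2.0]
predictor_rows = [[10, 1], [14, 2], [8, 1], [20, 4], [12, 3]]
human_ratings = [10.0, 9.0, 8.0, 16.0, 15.0]

computer_scores = [predict(intercept, weights, row) for row in predictor_rows]
r_observed = pearson_r(computer_scores, human_ratings)
r_corrected = correct_for_attenuation(r_observed, criterion_reliability=0.90)
```

Shrinkage for a given activity and dimension would then be obtained by
passing the coefficient reported for the original cross-validation
sample and the new coefficient to shrinkage().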
RESULTS/CONCLUSIONS

The adjusted pooled reliability estimates are presented in
Table I.



TABLE I

ADJUSTED POOLED RELIABILITY ESTIMATES FOR
FOUR JUDGES USING ANALYSIS OF VARIANCE

                        Dimension
Activity    Fluency    Flexibility    Originality

   4          .97          .95            .86
   5          .98          .96            .90
   6          .98          --             .82
   7          .98          .89            .81

**All correlations significant at .01 level.



These estimates ranged from .97 to .98 for fluency and .89 to .96 for
the flexibility ratings. The originality coefficients, ranging from
.81 to .90 for the four activities, were noticeably higher than those
found in the original study.
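
A reliability estimate for pooled ratings of this kind can be sketched
as follows. The sketch assumes a subjects-by-judges analysis of variance
in which differences among judge means are removed before the
reliability of the pooled rating is estimated, that is,
1 - MS(residual) / MS(between subjects). Whether this matches the exact
computation used in the study is an assumption, and the ratings shown
are hypothetical.

```python
from statistics import mean


def adjusted_pooled_reliability(ratings):
    """Reliability of the pooled rating of k judges from a two-way ANOVA.

    `ratings` holds one row per subject and one column per judge.
    Judge main effects are removed, so consistent leniency or severity
    of a judge does not lower the estimate.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of judges
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_judges = n * sum((m - grand) ** 2 for m in col_means)
    ss_residual = ss_total - ss_subjects - ss_judges

    ms_subjects = ss_subjects / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return 1 - ms_residual / ms_subjects


# Hypothetical ratings: five subjects, four judges.
ratings = [
    [4.0, 5.0, 4.0, 5.0],
    [7.0, 8.0, 8.0, 7.0],
    [2.0, 2.0, 3.0, 2.0],
    [9.0, 9.0, 8.0, 9.0],
    [5.0, 6.0, 5.0, 6.0],
]
reliability = adjusted_pooled_reliability(ratings)
```

When judges differ only by a constant amount for every subject, the
residual sum of squares is zero and the estimate equals 1, which is the
sense in which the estimate is adjusted for judge differences.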

The correlations between the computer-generated scores and the human
ratings, corrected for attenuation, for the full, restricted, and forced
models of the validation sample, as well as the results of the original
study, are given in Table II.
The four full model correlations in the new validation sample for
fluency exceeded .86. Shrinkage was limited to .08. Lower
correlations, ranging from .68 to .72, and greater shrinkage, up to .21,
were obtained for the flexibility dimension. Although shrinkage was
noted, the full model prediction equations for fluency and flexibility
were deemed stable and useful.

The originality coefficients were bounded by .42 and .70, with
shrinkage ranging from .08 to .40. Except for Activity five, these
results indicate that the original full model prediction equations
for originality are relatively unstable.

The results of the restricted and forced models paralleled those
of the full model. Although higher coefficients were generally found,
appreciable differences in predictive ability were not realized.
Consequently, the usefulness of these models is restricted to
considerations of parsimony.

While computer simulation models continue to appeal to the
fancy of man's mind, the usefulness of these models after initial
development must be continually evaluated. Otherwise, such models
will be considered mere academic games.